ABSTRACT

Fast Independent Component Analysis (FastICA) is a component separation algorithm based on the levels of non-Gaussianity.
Here we apply FastICA to the component separation problem of the microwave background including carbon monoxide
(CO) line emissions that are found to contaminate the PLANCK High Frequency Instrument (HFI) data. Specifically, we prepare
100 GHz, 143 GHz, and 217 GHz mock microwave sky maps including galactic thermal dust, NANTEN CO line, and the Cosmic
Microwave Background (CMB) emissions, and then estimate the independent components based on the kurtosis. We find that
FastICA can successfully estimate the CO component as the first independent component in our deflation algorithm, as its
distribution has the largest degree of non-Gaussianity among the components. By subtracting the CO and the dust components
from the original sky maps, we will be able to make an unbiased estimate of the cosmological CMB angular power spectrum.
1. INTRODUCTION

Precise measurements of the Cosmic Microwave Background (CMB)
anisotropies have been a powerful probe into the early universe and
cosmology. Experiments such as COBE (Bennett et al.), BOOMERanG
(MacTavish et al. 2006), and WMAP (Komatsu et al. 2011) have
already placed strong constraints on the parameters of the
cosmological model, such as the age of the universe, the baryon
and cold dark matter densities, and so on. The third generation
CMB satellite, PLANCK (Tauber et al. 2010), is expected to
release its cosmological results soon, which will include
constraints on the amplitude of primordial gravitational waves,
the spectral index of the primordial curvature perturbations and
its running, the amount of primordial non-Gaussianity, and
so on, and thereby will give constraints on the physics of the
early universe such as inflation.

Such cosmological information can be obtained only when
sources of uncertainty are removed successfully. With recent
high-resolution and sensitive instruments, the main source
of uncertainty is the contamination by foreground emissions
from the Galaxy, rather than the instrumental noise itself.
Therefore component separation methods have been progressively
developed, based on the analyses of data at different
frequencies and on the different frequency dependencies
of the astrophysical emission laws (for a recent review, see
Dunkley et al. (2009)). The methods include template fitting
(Efstathiou et al. 2009; Katayama & Komatsu 2011), Internal
Linear Combination (Eriksen et al. 2004), Correlated
Component Analysis (Bonaldi et al. 2006), Maximum Entropy
Method (Hobson et al. 1998), and so on, where the differences
are in how they model the data and in the assumptions
made on the foreground components.

Along with the synchrotron and thermal dust emissions,
which constitute a substantial portion of the foreground
component of the Galaxy at microwave frequencies, it has now
become clear that the rotational transitions of carbon monoxide
(CO) significantly contaminate the PLANCK observing
bands (Planck HFI Core Team et al. 2011). In particular, the
frequencies of the lowest two rotational transitions of CO,
namely J=(1-0) and J=(2-1), are at the first and third transmission
bands of PLANCK's High Frequency Instrument (HFI),
that is, at the 100 and 217 GHz bands. Therefore we must
develop a method to remove this contribution for cosmological
analysis.

A simple way to remove such a contribution may be to use
a template with help from other CO line surveys, such
as the Columbia survey (e.g., Dame et al. (2001)) and the NANTEN
Galactic Plane Survey (NGPS) by the NANTEN telescope
(Onishi et al. 2001; Mizuno & Fukui 2004). However, these
surveys are dedicated mainly to the Galactic plane, where
most of the molecular clouds have been found, and there has
been no full sky CO map to be compared with the PLANCK
full sky data. Therefore it will be helpful to find a method
to obtain (even rough) information about the distribution of
the CO molecular clouds, especially at high galactic latitudes,
from the PLANCK data alone.

To this end we consider a fast component separation method
to extract the CO distribution based on the Independent
Component Analysis (FastICA) (Hyvärinen & Oja 1997). The
FastICA has several advantages in comparison with the methods
mentioned above; among them the most important point
is that the FastICA method needs no prior assumptions about
the distribution of the foreground components and their
frequency dependences (therefore it is called Blind Source
Separation in the statistics community). Instead the method
uses statistical independence to separate the components, as
described shortly below. Therefore the method may be suitable
for estimating the distribution at the current moment, when
we do not know much about the CO distribution on the full
sky and its relative contributions to the PLANCK observing
bands. Several applications of the FastICA to the CMB
component separation problem can be found in the literature,
including applications to the COBE (Maino et al. 2002, 2003),
BEAST (Donzelli et al. 2006), and WMAP data
(Maino et al. 2007; Bottino et al. 2010) and to simulated 21 cm
maps (Chapman et al. 2012). Here we first apply the method
to extract the CO component assuming the PLANCK HFI and
evaluate its performance by means of Monte Carlo simulations,
in terms of the angular power spectrum of temperature
anisotropies, which is one of the most important statistics for
cosmological analysis.

2. FASTICA METHOD 

Fig. 1. — Angular power spectra of CO line emissions at the 100 GHz band
in the MBM and Pegasus region observed by the NANTEN telescope with
estimated error magnitude (red dashed line). The 217 GHz signal is estimated
assuming the line ratio of Ingalls et al. (2000) (blue dotted line). The CO angular
power spectrum gives a more significant contribution at smaller angular scales
relative to the cosmological signal (black line). Thanks to the high resolution
of the NANTEN telescope the noise level is negligible in our study (magenta
dash-dotted line).

ICA assumes that observed maps are given by a superposition
of independent astrophysical components and the cosmological
CMB. The ICA model is given by

$$ T^j(\hat{n}_i) = M^j_k S^k(\hat{n}_i) , \tag{1} $$

where $T^j(\hat{n}_i)$ are the observed temperatures in the $j$-th band
at the sky direction $\hat{n}_i$, $M^j_k$ is the mixing matrix, and $S^k$ with
$k = 1, 2, 3$ are the three independent sources considered in this
paper that correspond to CO, thermal dust and CMB emissions,
respectively. In our simulation we consider PLANCK's
$j$ = 100, 143 and 217 GHz bands. The ICA algorithm estimates
the sources and the mixing matrix simultaneously by
maximizing the degree of non-Gaussianity of the variable $Y^k$
with the matrix $W^k_j$,

$$ Y^k(\hat{n}_i) = W^k_j T^j(\hat{n}_i) . \tag{2} $$


Naively, this process is equivalent to maximizing "independency"
between the variables $Y^k$ because of the central limit
theorem: a sum of independent variables tends to be more Gaussian
than each of its terms, so the least Gaussian combinations are the
unmixed sources. When the non-Gaussianity of $Y^k$ takes the maximum,
$W$ should be $M^{-1}$ and $Y^k$ approaches $S^k$. In the present
analysis we consider a noisy ICA model which is given by

$$ T^j(\hat{n}_i) = M^j_k S^k(\hat{n}_i) + N^j(\hat{n}_i) , \tag{3} $$

where the instrumental (white) noise terms $N^j$ are taken from the
PLANCK specification (Planck Collaboration et al. 2011).
Because the noise level of the NANTEN telescope is not significant
compared to that of PLANCK, as shown in Fig. 1,
we have neglected it.
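
The noisy mixing model of Eq. (3) can be illustrated with a small numerical sketch. Everything below is a toy assumption made for illustration only: the three source distributions, the entries of the mixing matrix `M`, and the noise levels are hypothetical and are not the maps or coefficients used in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pix = 20_000                      # pixels in the toy patch (hypothetical)

# Toy stand-ins for the sources S^k (not the maps used in the paper):
co = rng.exponential(1.0, n_pix) * (rng.random(n_pix) < 0.05)  # sparse, heavy-tailed
dust = rng.gamma(4.0, 1.0, n_pix)                              # mildly skewed
cmb = rng.normal(0.0, 1.0, n_pix)                              # Gaussian
S = np.vstack([co, dust, cmb])      # shape (3, n_pix)

# Hypothetical mixing matrix M^j_k. Rows: 100, 143, 217 GHz bands.
# Columns: CO, dust, CMB. The CO entry at 143 GHz is set to zero.
M = np.array([[1.0, 0.2, 1.0],
              [0.0, 0.5, 1.0],
              [0.8, 2.0, 1.0]])

noise_rms = np.array([0.05, 0.03, 0.08])            # toy white-noise level per band
N = rng.normal(size=(3, n_pix)) * noise_rms[:, None]

T = M @ S + N                       # noisy ICA model, Eq. (3)

# If M were known and the noise removed, inverting the mixing would return
# the sources; this is the sense in which W should approach the inverse of M.
S_back = np.linalg.solve(M, T - N)
```

In practice neither `M` nor the noise realization is known, which is why the separation has to rely on the statistics of the maps alone.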

In what follows, we use the vector notation and write $T$ instead
of $T^j$, etc., for clarity. Following the standard ICA procedure,
we first quasi-whiten the observed data. The whitening
is done by the operations (Hyvärinen 1999)

$$ \tilde{x}(\hat{n}_i) = T(\hat{n}_i) - \bar{T} , \tag{4} $$

$$ x(\hat{n}_i) = (C - \Sigma)^{-1/2}\, \tilde{x}(\hat{n}_i) , \tag{5} $$

$$ \tilde{\Sigma} = (C - \Sigma)^{-1/2}\, \Sigma\, (C - \Sigma)^{-1/2} , \tag{6} $$

where $\bar{T} = E\{T\}$ is the mean temperature in each observed
band, $C = E\{\tilde{x}\tilde{x}^T\}$ is the covariance matrix of the observed
data, $\Sigma = E\{N N^T\}$ is the known noise covariance matrix, and
$\tilde{\Sigma}$ is that after quasi-whitening. The ensemble average $E\{\}$ is
estimated from the sample average over the observed pixels
$\hat{n}_i$. Thus, the problem is recast as finding a matrix $W$ which
maximizes the levels of non-Gaussianity of the variables
$y = Wx$.
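
A minimal sketch of the quasi-whitening step of Eqs. (4)-(6) follows, assuming the band maps are stored as a NumPy array of shape (bands, pixels) and that the inverse square root is taken through an eigen-decomposition. The function name `quasi_whiten` and the toy data are hypothetical.

```python
import numpy as np

def quasi_whiten(T, Sigma):
    """Quasi-whiten band maps T (n_band, n_pix) given the noise covariance Sigma."""
    x_c = T - T.mean(axis=1, keepdims=True)        # Eq. (4): remove the band means
    C = x_c @ x_c.T / x_c.shape[1]                 # sample covariance of the data
    vals, vecs = np.linalg.eigh(C - Sigma)         # C - Sigma is symmetric
    if np.any(vals <= 0.0):
        raise ValueError("C - Sigma is not positive definite")
    V = vecs @ np.diag(vals ** -0.5) @ vecs.T      # (C - Sigma)^(-1/2)
    x = V @ x_c                                    # Eq. (5)
    Sigma_t = V @ Sigma @ V.T                      # Eq. (6)
    return x, Sigma_t

# Toy data: three non-Gaussian sources, a hypothetical mixing matrix, white noise.
rng = np.random.default_rng(1)
M = np.array([[1.0, 0.2, 1.0],
              [0.0, 0.5, 1.0],
              [0.8, 2.0, 1.0]])
Sigma = np.diag([0.01, 0.02, 0.03])
S = rng.laplace(size=(3, 5000))
N = rng.normal(size=(3, 5000)) * np.sqrt(np.diag(Sigma))[:, None]
x, Sigma_t = quasi_whiten(M @ S + N, Sigma)

# The sample covariance of the quasi-whitened data equals I + Sigma_t.
cov_x = x @ x.T / x.shape[1]
```

The last line checks the identity used later in the derivation: the covariance of the quasi-whitened data is the identity plus the transformed noise covariance.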

In the analysis we adopt the deflation algorithm, i.e., we
estimate the independent components one-by-one by maximizing
the non-Gaussianity of the variable $y(\hat{n}_i) = w^T x(\hat{n}_i)$, under
a constraint $|w|^2 = 1$. In this case, the vector $w^T$ is a row of the
matrix $W$. To find $w^T$ which maximizes the non-Gaussianity
of the variable $y$, we need an evaluation function of the level
of non-Gaussianity. In the present analysis we use the kurtosis
as the evaluation function $g(y)$:

$$ g(y) = {\rm kurt}(y) = E\{y^4\} - 3\left(E\{y^2\}\right)^2 . \tag{7} $$


The absolute value $|g(y)|$ takes its minimum, $|g(y)| = 0$, when the
variable $y$ is Gaussian distributed. The gradient of the kurtosis is
given by

$$ \frac{\partial\, {\rm kurt}(w^T x)}{\partial w} = 4E\{(w^T x)^3 x\} - 12\left[w^T (I + \tilde{\Sigma})\, w\right] (I + \tilde{\Sigma})\, w , \tag{8} $$

where we have used the fact that $E\{x x^T\} = I + \tilde{\Sigma}$. In the
standard gradient method, the parameter $w$ should be updated
by

$$ \Delta w \propto E\{(w^T x)^3 x\} - 3\left[w^T (I + \tilde{\Sigma})\, w\right] (I + \tilde{\Sigma})\, w . \tag{9} $$

Because we have restricted the parameter space by the constraint
$|w|^2 = 1$, the vector $w$ should satisfy the condition
$w \propto \Delta w$ at the stable point. Therefore one obtains the fixed
point algorithm (Hyvärinen & Oja 1997),

$$ w_{\rm new} = E\{(w^T x)^3 x\} - 3\left[w^T (I + \tilde{\Sigma})\, w\right] (I + \tilde{\Sigma})\, w , \tag{10} $$

which is followed by a normalization $w_{\rm new} \leftarrow w_{\rm new}/|w_{\rm new}|$.
The above procedure maximizes the non-Gaussianity of the
variable $y = w^T x$ in terms of the kurtosis.
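
The fixed-point iteration of Eq. (10) can be sketched for a single unit vector as follows. This is an illustrative implementation under simplifying assumptions: the toy data are noise-free, so that $\tilde{\Sigma} = 0$ and $E\{xx^T\} = I$, and the names `kurtosis` and `fixed_point_unit`, the stopping tolerance, and the two toy sources are hypothetical choices.

```python
import numpy as np

def kurtosis(y):
    """Eq. (7) for a zero-mean sample y."""
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

def fixed_point_unit(x, A, w0, n_iter=200, tol=1e-10):
    """Iterate Eq. (10) for one unit vector w.

    x : (n_band, n_pix) quasi-whitened data
    A : E{x x^T}, i.e. the identity plus the quasi-whitened noise covariance
    """
    w = w0 / np.linalg.norm(w0)
    for _ in range(n_iter):
        y = w @ x
        w_new = (x * y**3).mean(axis=1) - 3.0 * (w @ A @ w) * (A @ w)
        w_new /= np.linalg.norm(w_new)             # normalization step
        # w and -w describe the same component, so compare up to sign.
        if 1.0 - abs(w_new @ w) < tol:
            return w_new
        w = w_new
    return w

# Noise-free toy example: two non-Gaussian sources and a hypothetical mixing.
rng = np.random.default_rng(2)
n = 20_000
S = np.vstack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n)])
T = np.array([[1.0, 0.5], [0.3, 1.0]]) @ S
x_c = T - T.mean(axis=1, keepdims=True)
vals, vecs = np.linalg.eigh(x_c @ x_c.T / n)
x = vecs @ np.diag(vals ** -0.5) @ vecs.T @ x_c    # ordinary whitening (no noise)

w = fixed_point_unit(x, np.eye(2), rng.normal(size=2))
y = w @ x
corr = [abs(np.corrcoef(y, s)[0, 1]) for s in S]   # match against each input source
```

The recovered `y` should follow one of the two inputs up to sign and scale; which one is found depends on the starting vector in this toy setup.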

When we need to estimate more than one independent component,
we can repeat the above procedure. In this case, an
orthogonalization step must be performed at every iteration before
the normalization step to prevent the different vectors $w^{(n)}$
from converging to the same vector, where $(n)$ means the $n$-th
independent component. The orthogonalization can be done,
for example, by a Gram-Schmidt-like decorrelation method:

$$ w^{(p+1)} = w^{(p+1)} - \sum_{j=1}^{p} \left( w^{(p+1)T} w^{(j)} \right) w^{(j)} . \tag{11} $$
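
The deflation scheme with the Gram-Schmidt-like step of Eq. (11) can be sketched as below. This is a self-contained illustration rather than the code used for the results in this paper: the function name `deflation_ica`, the iteration limits, and the noise-free toy sources (one of them Gaussian, standing in for the CMB) are assumptions.

```python
import numpy as np

def deflation_ica(x, A, n_comp, rng, n_iter=300, tol=1e-10):
    """Estimate the rows of W one by one from quasi-whitened data x (n_band, n_pix)."""
    W = np.zeros((n_comp, x.shape[0]))
    for p in range(n_comp):
        w = rng.normal(size=x.shape[0])
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ x
            w_new = (x * y**3).mean(axis=1) - 3.0 * (w @ A @ w) * (A @ w)  # Eq. (10)
            w_new -= W[:p].T @ (W[:p] @ w_new)     # Eq. (11): remove earlier components
            w_new /= np.linalg.norm(w_new)
            converged = 1.0 - abs(w_new @ w) < tol
            w = w_new
            if converged:
                break
        W[p] = w
    return W

# Noise-free toy data: two non-Gaussian sources and one Gaussian source.
rng = np.random.default_rng(3)
n = 20_000
S = np.vstack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n), rng.normal(size=n)])
M = np.array([[1.0, 0.2, 1.0], [0.0, 0.5, 1.0], [0.8, 2.0, 1.0]])  # hypothetical mixing
x_c = M @ S
x_c = x_c - x_c.mean(axis=1, keepdims=True)
vals, vecs = np.linalg.eigh(x_c @ x_c.T / n)
x = vecs @ np.diag(vals ** -0.5) @ vecs.T @ x_c

W = deflation_ica(x, np.eye(3), 3, rng)
Y = W @ x
corr = np.abs(np.corrcoef(np.vstack([Y, S]))[:3, 3:])   # rows: estimates, columns: inputs
```

Because each new vector is projected away from the ones already found, the rows of `W` stay orthonormal, and the last component is fixed by orthogonality even when its own kurtosis is close to zero.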



3. SKY MODEL 



Fig. 2. — CO line emission intensity at the MBM and Pegasus region
observed by the NANTEN telescope. The unobserved (masked) region is shown in
blue.

In Fig. 3 we show simulated sky maps at the 100, 143, and
217 GHz PLANCK bands including the CMB, foreground
components and the PLANCK instrumental noises ($T^j$ in
Eq. (1)). For the CMB component we generate random Gaussian
skies with the angular power spectrum consistent with
the WMAP 7 year results (Larson et al. 2011). The skies
are convolved with a spherical beam with the largest beam
width among the three bands, namely, FWHM = 9.37 arcmin
at the 100 GHz band (Planck Collaboration et al. 2011). For
the instrumental noises we assume white noises for about 14
months of observation with the amplitudes

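$$ \Delta T\,[\mu{\rm K/pix}] = \Delta T\,[\mu{\rm K/beam}]\; \frac{\theta_{\rm FWHM}}{\theta_{\rm pix}} = \begin{cases} 2.80 & (100\,{\rm GHz}) \\ 1.79 & (143\,{\rm GHz}) \\ \text{?}.55 & (217\,{\rm GHz}) , \end{cases} \tag{12} $$

where $\theta_{\rm pix} = 6.87$ arcmin (Gaussian) with the HEALPix parameter
$N_{\rm side} = 512$ (Górski et al. 2005). The noise terms $N^j$ in
Eq. (3) are realized randomly from normal distributions with
standard deviations given by Eq. (12).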
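
As a quick check of the numbers above, the HEALPix pixel size for $N_{\rm side} = 512$ can be computed from the equal-area pixelization, and white-noise maps can then be drawn with the per-pixel amplitudes of Eq. (12). The sketch below uses only the 100 and 143 GHz amplitudes, treats them as the rms of a zero-mean normal distribution, and takes a patch covering 0.8% of the sky; the variable names are hypothetical.

```python
import math
import numpy as np

nside = 512
n_pix = 12 * nside**2                                # number of HEALPix pixels on the full sky
theta_pix = math.degrees(math.sqrt(4.0 * math.pi / n_pix)) * 60.0   # pixel side in arcmin

sigma_pix = {"100GHz": 2.80, "143GHz": 1.79}         # per-pixel amplitudes from Eq. (12)

f_sky = 0.008                                        # sky fraction of the analysed patch
n_patch = round(f_sky * n_pix)                       # pixels inside the patch

rng = np.random.default_rng(3)
noise = {band: rng.normal(0.0, s, n_patch) for band, s in sigma_pix.items()}
```

The value of `theta_pix` comes out at about 6.87 arcmin, in agreement with the pixel size quoted with Eq. (12).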

For the foreground components we assume galactic thermal
dust and CO line emissions. In particular, thermal dust
emissions are prominent at 217 GHz, and CO line contaminations
at the 100 and 217 GHz bands. For the thermal
dust emissions we follow the "model 8" of Finkbeiner
et al. (Finkbeiner et al. 1999), which gives predictions of
dust maps at microwave frequencies through extrapolations
from Schlegel et al. (1998). The same model maps are
implemented in a recent paper (Sehgal et al. 2010), and slightly
different maps ("model 7") are used in the PLANCK Sky Model
(Delabrouille et al. 2012). For the CO line contamination at
the 100 GHz band we use real data at the MBM and Pegasus
region observed by the NANTEN telescope (Yamamoto et al. 2003,
2006) and convert the NANTEN velocity-integrated intensity
map to the CMB temperature map at the 100 GHz band by
multiplying by the conversion factor $\alpha^{J=1-0} \equiv T^{100\,{\rm GHz}}/I^{J=1-0} = 14.2$
found by the PLANCK team (Planck HFI Core Team et al.
2011). The FWHM of the NANTEN beam is about 2.6 arcmin
and we smooth the map to 9.37 arcmin by using the subroutine
alteralm in the HEALPix facilities in order to match the
FWHM of the PLANCK 100 GHz band. While the CO line
contamination at the 143 GHz band is found not to be significant
(Planck HFI Core Team et al. 2011), we must take into
consideration the contamination at the 217 GHz band, where
the transition J=(2-1) comes in. Because NANTEN data are
not available for this transition, we make a simple assumption
that the intensity of the J=(2-1) transition is proportional to that
of J=(1-0).

Fig. 3. — Simulated CMB sky maps at 100 GHz (top), 143 GHz (middle),
and 217 GHz (bottom).
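
The beam matching described above, from the 2.6 arcmin NANTEN beam to the 9.37 arcmin beam of the 100 GHz band, can be sketched in harmonic space. The sketch assumes circular Gaussian beams, for which the window function is $B_\ell = \exp[-\ell(\ell+1)\sigma^2/2]$ with $\sigma = {\rm FWHM}/\sqrt{8\ln 2}$; the function names are hypothetical, and the code is not a reproduction of the HEALPix routine itself.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)                    # radians per arcminute
FWHM_TO_SIGMA = 1.0 / math.sqrt(8.0 * math.log(2.0))

def gaussian_beam(ell, fwhm_arcmin):
    """Window function B_ell of a circular Gaussian beam."""
    sigma = fwhm_arcmin * ARCMIN * FWHM_TO_SIGMA
    return math.exp(-0.5 * ell * (ell + 1) * sigma**2)

def rebeam_factor(ell, fwhm_in=2.6, fwhm_out=9.37):
    """Factor multiplying each a_lm to change the beam from fwhm_in to fwhm_out."""
    return gaussian_beam(ell, fwhm_out) / gaussian_beam(ell, fwhm_in)

# Equivalent extra smoothing kernel: Gaussian widths add in quadrature.
fwhm_kernel = math.sqrt(9.37**2 - 2.6**2)            # about 9.0 arcmin
```

Since the input beam is much narrower than the target beam, the additional smoothing of about 9.0 arcmin is close to the target width itself.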

Specifically, we make a toy sky map for the CO line emission
at 217 GHz by

$$ I^{J=2-1} = R^{J=2-1}_{J=1-0}\, I^{J=1-0} . \tag{13} $$

Here the integrated line ratio, $R^{J=2-1}_{J=1-0} = 0.77 \pm 0.24$, is taken
from Ingalls et al. (2000), in which they estimated the ratio
between J=(4-3), (2-1) and (1-0) at high galactic latitude molecular
clouds based on observations using the Antarctic Submillimeter
Telescope and Remote Observatory and the Five College
Radio Astronomy Observatory.

Because the CO line emission data are limited to the MBM
and Pegasus region ($f_{\rm sky} \approx 0.8\%$) shown in Fig. 2, we
concentrate our analysis on this region only.

3.1. Foreground Components in the CMB angular power 
spectrum 

We depict the angular power spectra in Fig. 4, showing
the impact of the foreground components at the MBM and
Pegasus region and of the instrumental noises on the power
spectra. To estimate the power spectra we mask all the pixels
outside the MBM and Pegasus region and use the PolSpice
code (Szapudi et al. 2001; Chon et al. 2004). Errors are
estimated by generating five hundred mock CMB and instrumental
white noise maps as described earlier. Because the mask
covers most of the sky ($f_{\rm sky} \approx 0.8\%$ for the MBM and
Pegasus region), we have large cosmic variance errors
with correlations between neighboring multipoles; we therefore
bin the spectra with the bandwidth $\Delta\ell = 25$.
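
The binning with $\Delta\ell = 25$ can be sketched as follows. The sketch assumes equal weights within each bin in $\ell(\ell+1)C_\ell/2\pi$, bins starting at $\ell = 2$, and a spectrum array indexed from $\ell = 0$; the factor $2\pi$, the starting multipole, and the function name `bin_spectrum` are assumptions made for illustration.

```python
import numpy as np

def bin_spectrum(cl, delta_ell=25, ell_min=2):
    """Average ell(ell+1)C_ell/2pi in bins of width delta_ell.

    cl is indexed by multipole, starting at ell = 0.
    Returns the bin centres and the band powers.
    """
    ell = np.arange(len(cl))
    d_ell = ell * (ell + 1) * cl / (2.0 * np.pi)
    n_bins = (len(cl) - ell_min) // delta_ell
    centres, bands = [], []
    for b in range(n_bins):
        lo = ell_min + b * delta_ell
        hi = lo + delta_ell                          # exclusive upper edge
        centres.append(0.5 * (lo + hi - 1))
        bands.append(d_ell[lo:hi].mean())            # equal weight per multipole
    return np.array(centres), np.array(bands)

# Check with a spectrum that is flat in ell(ell+1)C_ell/2pi.
ell = np.arange(1002)
cl = np.zeros(ell.size)
cl[1:] = 2.0 * np.pi / (ell[1:] * (ell[1:] + 1))
centres, bands = bin_spectrum(cl)
```

For the flat test spectrum every band power equals one, which confirms that the weights inside a bin sum to unity.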

It has been known that the thermal dust component has
larger power at higher frequencies and affects the spectrum
mainly at larger angular scales (Tegmark et al. 2000; Masi
2004). We find that this holds true for the MBM and Pegasus
region considered here. On the other hand, we find that
the CO component gives significant contamination at smaller
angular scales. The contamination can be larger than the
instrumental and cosmic variance errors at $\ell > 900$ and $\ell > 400$
at the 100 and 217 GHz bands, respectively. Galactic synchrotron
emissions are found to be always subdominant at
those frequencies and in that multipole range because the MBM and
Pegasus region is far away from the galactic disk, and we have
omitted them in our current analysis.

4. RESULTS 

4.1. Foreground components estimated by the FastICA 

To the sky maps prepared in the previous section we apply 
the FastICA algorithm in order to estimate the CO contribu- 
tion and subtract it from the maps. As described earlier, we 
use the sky maps at three frequency bands to separate out three 
independent components based on the kurtosis. 

In Fig. 6 the three sources obtained from the ICA algorithm
are shown. We find that, in our deflation algorithm, the
algorithm always finds the CO-like component as the first
independent component ($S^1$), irrespective of the initial condition
for the vector $w$. This is caused by the fact that the distribution
of the CO line intensity has the largest non-Gaussianity
in terms of the kurtosis among the three components (CO, dust
and CMB). It is also evident from the figure that the second
independent component $S^2$ mostly corresponds to the thermal
dust component given by the dust model of Finkbeiner et al.

Fig. 4. — The binned power spectra and error bars estimated from 500
Monte-Carlo simulations, with foreground components and instrumental
noises. For clarity, the positions of the bins for 100 and 217 GHz are shifted
by $\Delta\ell = \pm 5$, respectively. The CO components affect the spectrum at small
angular scales at the 100 and 217 GHz bands, while the thermal dust component
dominates at large angular scales at the 217 GHz band.

Fig. 5. — The fractional error in the estimated CO intensity at each pixel
at the 100 GHz band. Here the CO is assumed to be the first independent
component. While the errors can be large at pixels in which the CO intensities
are intrinsically small, $T_{\rm CO} < 50\,\mu$K, the errors are less than 10% in pixels
with large temperature in this particular realization.

In order to investigate the performance of the FastICA
method as an estimator of the CO component we make a scatter
plot as shown in Fig. 5. In the figure we show the intensity
of the first independent component estimated by the FastICA
at each pixel against the input CO intensity. We find that the
accuracy depends on realizations; namely, on the particular CMB
and noise realizations on which the CO emissions are superimposed.
Overall, for most of the realizations the errors are
less than 30% at pixels where the CO emissions are
strong, while the estimated intensity has large scatter where
the CO intensity is intrinsically small.

Because the performance depends on realizations, we again
use the Monte-Carlo simulations described earlier to rate the
performance statistically as discussed below.
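
A per-pixel comparison of this kind can be sketched as follows. Because the amplitude and sign of an independent component are arbitrary, the estimate has to be rescaled before it is compared with the input. The sketch fixes the scale by a least-squares fit to the input map, which is an assumption made here for illustration and not necessarily the normalization used for Fig. 5; the toy CO map, the noise level, and the function name are also hypothetical.

```python
import numpy as np

def fractional_error(y_est, co_true):
    """Rescale the estimated component to the input map and return the fractional error."""
    a = (y_est @ co_true) / (y_est @ y_est)          # least-squares scale, absorbs the sign
    co_est = a * y_est
    frac = (co_est - co_true) / co_true
    return co_est, frac

# Toy input: positive CO-like intensities and an estimate with arbitrary
# negative amplitude plus Gaussian scatter.
rng = np.random.default_rng(6)
co_true = rng.exponential(40.0, 20_000)
y_est = -0.3 * (co_true + rng.normal(0.0, 2.0, co_true.size))

co_est, frac = fractional_error(y_est, co_true)
bright = co_true > 50.0
faint = co_true < 5.0
med_bright = np.median(np.abs(frac[bright]))
med_faint = np.median(np.abs(frac[faint]))
```

With a fixed absolute scatter, the fractional error is small in bright pixels and grows where the input intensity is low, which is the qualitative behaviour described above.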

4.2. Angular Power Spectrum Estimation 
To rate statistically the performance of the FastICA method,
which depends on realizations, here we apply the FastICA
method to all the simulated CMB maps, estimate the foreground
components to be removed, and calculate the CMB
angular power spectrum for each realization. The results are
shown in Fig. 7. In this figure we show the power spectrum
of the CMB component that is estimated as the third independent
component. Clearly, the additional power
due to the galactic foregrounds found in Fig. 4 is successfully
removed from all the frequency bands by the ICA, and the
input CMB angular power spectrum is recovered within the
error bars.

Interestingly, we find that the error bars are almost unaffected
by this procedure. The errors increase by less than about
10% at the 217 GHz band, where the foreground contamination
is the largest, while at the 100 and 143 GHz bands the
error bars in the estimations of the CMB power spectra have
almost the same magnitude as those before the foreground
removal. The 10% increase of the errors can be considered as
the uncertainty in the FastICA method.

In the bottom panel of Fig. 7 we again depict the band
powers normalized by the input CMB angular power spectrum.
An acoustic-like structure in the estimated bias is seen;
the power is underestimated around the acoustic peaks and
overestimated around the dips. This is partly because we have
binned the power spectrum with an equal weighting in the
$\ell(\ell+1)C_\ell$ space. Note however that the bias is well within the
error bars.
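
The Monte Carlo summary behind such a comparison can be sketched as follows, assuming the band powers of all realizations are stored as an array of shape (realizations, bins). The toy band powers and the 20% scatter are hypothetical numbers chosen only to make the example run.

```python
import numpy as np

def summarize_band_powers(sims, truth):
    """sims: (n_sim, n_bin) band powers; truth: (n_bin,) input band powers."""
    mean = sims.mean(axis=0)
    err = sims.std(axis=0, ddof=1)                   # scatter of a single realization
    frac_bias = mean / truth - 1.0                   # bias relative to the input spectrum
    return mean, err, frac_bias, err / truth

rng = np.random.default_rng(5)
truth = np.linspace(5000.0, 1000.0, 40)              # hypothetical input band powers
sims = truth * (1.0 + 0.2 * rng.normal(size=(500, 40)))

mean, err, frac_bias, frac_err = summarize_band_powers(sims, truth)
within_errors = np.abs(frac_bias) < frac_err         # bias compared with the error bar
```

Comparing `frac_bias` with `frac_err` bin by bin is one simple way to state that a bias is within the error bars.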

5. SUMMARY AND DISCUSSION 

In this paper we considered the CMB foreground subtraction
problem, paying particular attention to the lowest two
rotational transitions of the CO molecule, J=(1-0) and J=(2-1), that
contaminate the PLANCK 100 GHz and 217 GHz bands,
respectively. Firstly, we estimated the angular power spectrum
of the CO line emissions at the MBM and Pegasus region
observed by the NANTEN telescope, and found that the CO line
emissions make a significant contribution to the angular power
spectrum, especially at small angular scales ($\ell > 900$ and $\ell > 400$
for the 100 and 217 GHz bands, respectively).

The CO contamination, if it is not taken into account
correctly, will cause a wrong estimation of cosmological
parameters. In particular, the parameters related to the primordial
fluctuation amplitude, such as the amplitude of the curvature
perturbation and its spectral index, will be significantly
affected. Indeed, we found that even for the small MBM and
Pegasus region the bias in the estimation of these parameters
is beyond the $1\sigma$ error bars, toward larger amplitude
and larger spectral index. We should stress, however, that this
result holds only for the MBM and Pegasus region of the sky.
How large the CO contamination is in the estimation of the
full sky CMB angular power spectrum is left as a future issue.

Secondly, we applied the FastICA to the foreground subtraction
problem including the CO line emissions. The FastICA
algorithm can separate the components based on the
independence of the components, or equivalently the level of
non-Gaussianity, without any prior knowledge of the distribution
and frequency dependence of each foreground component.
We find that the CO-like component is extracted as the first
independent component in our deflation algorithm, as the CO
distribution has the largest non-Gaussianity among the components
considered here. This fact can be used to estimate
quickly the CO component in the PLANCK data.

Based on the Monte Carlo simulations including CO and
thermal dust emissions as foregrounds, we investigated how
the CMB is recovered in terms of the power spectrum.
Though the accuracy depends on the particular realization of
the instrumental noises and CMB, we found that the recovery
works very well in a statistical sense. The success is thanks to the
approximate statistical independence between the foreground
components like CO and the background CMB. This is consistent
with the result in the earlier literature where the authors
applied the FastICA method to the WMAP data and found
that it can recover the CMB angular power spectrum consistent
with the spectrum independently derived by the WMAP
team (Maino et al. 2007).

Fig. 6. — Independent components ($S^1$, $S^2$, and $S^3$) obtained from the ICA
algorithm. The source with the largest non-Gaussianity is shown in the top
panel and the smallest in the bottom. It is clear that the ICA successfully
estimates the CO distribution ($S^1$) in the top panel. The second source seems
to be thermal dust emissions. The amplitudes are arbitrary because of the
degeneracy between the sources and mixing matrix components.

Fig. 7. — The binned power spectrum and error bars estimated from 500
Monte-Carlo simulations (top) and fractional errors (bottom). For clarity,
the positions of the bins for 100 and 217 GHz are shifted by $\Delta\ell = \pm 5$,
respectively. Acoustic structure in the estimated bias is seen; the power is
underestimated around the acoustic peaks and overestimated around the dips.
However, the bias is well within the error bars.

Finally, we should comment on the impact of the FastICA
method on the level of non-Gaussianity in the estimated CMB
map. Because the method relies on the non-Gaussianity to
estimate the independent components, naively it should affect
the non-Gaussianity of the CMB, which probably has the
smallest degree of non-Gaussianity among the components in
the microwave sky. In Fig. 8 we show the kurtoses in the estimated
CMB (red) and CMB+Foreground (blue) maps against
the value in the input CMB maps at the 217 GHz band in the
MBM and Pegasus region. Clearly, the bulk of
the kurtosis, which comes from the thermal dust and CO components,
is removed through the method. Interestingly, we find
that some portion of the kurtosis in the CMB maps (which
should be zero in the mean in our Gaussian simulation) is
recovered with a scatter of about 20% when the kurtosis has a large
value ($|{\rm kurt}| > 0.2$). However, the accuracy depends on the
size of the kurtosis, and the method induces a false signal of
the kurtosis in the estimated CMB map when the input kurtosis
is too small. We leave this issue for future investigation.
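
The statement that the kurtosis of a Gaussian map vanishes only in the mean can be illustrated with a short sketch. It assumes unit-variance maps with statistically independent pixels, for which the scatter of the kurtosis of Eq. (7) over realizations is approximately $\sqrt{24/N_{\rm pix}}$; real CMB maps are smoothed and have correlated pixels, so their scatter is expected to be larger than in this toy case. The pixel count and the number of realizations are hypothetical.

```python
import numpy as np

def kurt(y):
    """Kurtosis of Eq. (7) after removing the sample mean."""
    y = y - y.mean()
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

rng = np.random.default_rng(4)
n_pix, n_sim = 25_000, 200                           # roughly a 0.8% patch at N_side = 512
maps = rng.normal(size=(n_sim, n_pix))               # unit-variance Gaussian toy maps
k = np.array([kurt(m) for m in maps])

expected_scatter = np.sqrt(24.0 / n_pix)             # large-sample value for independent pixels
```

Each realization therefore carries a small non-zero kurtosis of its own, which is the quantity that the estimated CMB map is compared against.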




Fig. 8. — (Top) Kurtoses in the estimated CMB (red) and 
CMB+Foreground (blue) maps at the 217 GHz band. The straight 
line represents linear relation. (Bottom) Fractional differences between the 
input and estimated kurtoses. 

In conclusion, in this paper we found that the FastICA can
efficiently extract the CO line foregrounds that contaminate
the PLANCK HFI bands. The method will be useful for estimating
the CO distribution in the real PLANCK data, and any
foreground component whose distribution is not known in
advance in future CMB experiments.